Automatic Accent Identification Using Gaussian Mixture Models
Abstract
It is well known that speaker variability caused by accent is an important factor in speech recognition. Some major accents in China differ so much that this problem becomes severe. In this paper, we propose a Gaussian mixture model (GMM) based method for Mandarin accent identification. In this method, a number of GMMs are trained, and the most likely accent is identified from the test utterances. The identified accent type can then be used to select an accent-dependent model for speech recognition. A multi-accent Mandarin corpus was developed for the task, covering 4 typical accents in China with 1,440 speakers (1,200 for training, 240 for testing). We explore experimentally how the number of GMM components affects identification performance. We also investigate how many utterances per speaker are sufficient to recognize that speaker's accent reliably. Finally, we show the correlations among accents and discuss the results.
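The decision rule described in the abstract, scoring the test utterances against one GMM per accent and choosing the most likely accent, can be sketched as follows. This is a minimal illustration and not the paper's implementation: it assumes diagonal-covariance GMMs whose parameters are already trained, it skips feature extraction and training, and the names `log_gmm_density`, `identify_accent`, and `ACCENT_MODELS`, along with all numeric values, are hypothetical.

```python
import math


def log_gmm_density(frame, weights, means, variances):
    """Log-density of one feature vector under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        # log(weight) + log N(frame | mu, diag(var)), summed over dimensions
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        component_logs.append(log_p)
    # Log-sum-exp over components for numerical stability
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


def identify_accent(frames, accent_models):
    """Return the accent whose GMM gives the highest total log-likelihood.

    `frames` holds the feature vectors pooled from a speaker's test
    utterances; frames are treated as independent, so per-frame
    log-densities are summed.
    """
    scores = {
        accent: sum(log_gmm_density(f, *params) for f in frames)
        for accent, params in accent_models.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical 2-D, 2-component models: (weights, means, variances).
# The values are made up for illustration and are not from the paper.
ACCENT_MODELS = {
    "accent_A": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "accent_B": ([0.5, 0.5], [[4.0, 4.0], [5.0, 5.0]], [[1.0, 1.0], [1.0, 1.0]]),
}

frames = [[4.2, 3.9], [5.1, 4.8], [4.6, 4.4]]
best, scores = identify_accent(frames, ACCENT_MODELS)
print(best)
```

Pooling frames from more utterances adds more terms to each sum, which is one way to see why the number of utterances per speaker affects how reliable the decision is.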
Similar resources
Modelling Accents for Automatic Speech Recognition
Accent is cited as an issue for speech recognition systems. If they are to be widely deployed, Automatic Speech Recognition (ASR) systems must deliver consistently high performance across user populations. Hence the development of accent-robust ASR is of significant importance. This research investigates techniques for compensating for the effects of accents on performance of Hidden Markov Model...
Eigen-channel compensation and discriminatively trained Gaussian mixture models for dialect and accent recognition
This paper presents a series of dialect/accent identification results for three sets of dialects with discriminatively trained Gaussian mixture models and feature compensation using eigen-channel decomposition. The classification tasks evaluated in the paper include: 1) the Chinese language classes, 2) American and Indian accented English and 3) discrimination between three Arabic dialects. The...
Speaker, Accent, and Language Identification Using Multilingual Phone Strings
Currently, approaches based on Gaussian Mixture Models (GMMs) [4] are the most widely and successfully used methods for speaker identification. Although GMMs have been applied successfully to close-speaking microphone scenarios under matched training and testing conditions, their performance degrades dramatically under mismatched conditions. The term “mismatched condition” describes a situation...
Acoustic model selection for recognition of regional accented speech
Accent is cited as an issue for speech recognition systems [1]. Research has shown that accent mismatch between the training and the test data will result in significant accuracy reduction in Automatic Speech Recognition (ASR) systems. Using HMM based ASR trained on a standard English accent, our study shows that the error rates can be up to seven times higher for accented speech, than for stan...
Beating Henry Higgins at His Own Game: A Markovian Approach to Dialectology
1. Introduction The performance of speech recognition algorithms degrades considerably due to speaker variability. Aside from gender, the largest cause for speaker variability is accent. If the accent of a speaker can be determined automatically, then accent-specific speech recognition models can be used, thereby increasing speech recognition accuracy. In this study, the problem of accent class...